BENFORD'S LAW, VALUES OF L-FUNCTIONS AND THE 3x + 1 PROBLEM

ALEX V. KONTOROVICH AND STEVEN J. MILLER 



Abstract. We show the leading digits of a variety of systems satisfying certain
conditions follow Benford's Law. For each system proving this involves two
main ingredients. One is a structure theorem of the limiting distribution, 
specific to the system. The other is a general technique of applying Poisson 
Summation to the limiting distribution. We show the distribution of values of 
L-functions near the central line and (in some sense) the iterates of the 3x + 1 
Problem are Benford. 



1. Introduction

While looking through tables of logarithms in the late 1800s, Newcomb [New]
noticed a surprising fact: certain pages were significantly more worn than others.
People were referencing numbers whose logarithm started with 1 more frequently
than other digits. In 1938 Benford [Ben] observed the same digit bias in a wide
variety of phenomena.

Instead of observing one-ninth (about 11%) of entries having a leading digit
of 1, as one would expect if the digits 1, 2, ..., 9 were equally likely, over 30%
of the entries had leading digit 1, and about 70% had leading digit less than 5.
Since $\log_{10} 2 \approx 0.301$ and $\log_{10} 5 \approx 0.699$, one may speculate that the probability
of observing a digit less than $k$ is $\log_{10} k$, meaning that the probability of seeing
a particular digit $j$ is $\log_{10}(j+1) - \log_{10} j = \log_{10}\left(1 + \frac{1}{j}\right)$. This logarithmic
phenomenon became known as Benford's Law after his paper containing extensive
empirical evidence of this distribution in diverse data sets gained popularity. See
[Hi1] for a description and history, [Hi2, BBH] for some recent results, and page 255
of [Knu] for connections between Benford's law and rounding errors in computer
calculations.
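
As a quick numerical companion to the formula $\log_{10}(1 + 1/j)$, the following minimal Python sketch tabulates the Benford probabilities of the nine leading digits. The function name `benford_probability` and the `base` parameter are our own illustrative choices, not notation from the references.

```python
import math

def benford_probability(digit: int, base: int = 10) -> float:
    """Benford probability that the leading digit (in the given base) equals `digit`."""
    if not 1 <= digit < base:
        raise ValueError("digit must satisfy 1 <= digit < base")
    # log_B(digit + 1) - log_B(digit) = log_B(1 + 1/digit)
    return math.log(1 + 1 / digit, base)

probs = [benford_probability(j) for j in range(1, 10)]
for j, p in enumerate(probs, start=1):
    print(f"digit {j}: {p:.3f}")
```

The nine values sum to 1, and the first four sum to $\log_{10} 5$, matching the "about 70%" figure quoted above.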

In |BBH| it was proved that many dynamical systems are Benford, including most 
power, exponential and rational functions, linearly-dominated systems, and non- 
autonomous dynamical systems. This adds to the ever-growing family of systems 
known or believed to satisfy Benford's Law, such as physical constants, stock market 
indices, tax returns, sums and products of random variables, the factorial function 
and Fibonacci numbers, just to name a few. 

We introduce two new additions to the family, the Riemann zeta function (and
other L-functions) and the $3x + 1$ Problem (and other $(d, g, h)$-Maps), though we
prove the theorems in sufficient generality to include other systems. Roughly, the
distribution of digits of values of L-functions near the critical line and the ratio of
observed versus predicted values of iterates of the $3x + 1$ Map tend to Benford's
Law. For exact statements of the results, see Theorem 4.4 and Corollary 4.5 for
L-functions and Theorem 5.3 for the $3x + 1$ Problem. While the best error terms
just miss proving Benford behavior for L-functions on the critical line, we show
that the values of the characteristic polynomials of unitary matrices are Benford in
Appendix A; as these characteristic polynomials are believed to model the values
of L-functions, this and our theoretical results naturally lead to the conjecture that
values of L-functions on the critical line are Benford.

A standard method of proving Benford behavior is to show the logarithms of 
the values become equidistributed modulo 1; Benford behavior then follows by 
exponentiation. There are two needed inputs. For both systems the main term 
of the distribution of the logarithms is a Gaussian, which can be shown to be 
equidistributed modulo 1 by Poisson summation. The second ingredient is to control 
the errors in the convergence of the distribution of the logarithms to Gaussians. 
For L-functions this is accomplished by Hejhal's refinement of the error terms (his
result follows from an analysis of high moments of integrals of $\log |L(s, f)|$), and
for the $3x + 1$ Problem it involves an analysis of the discrepancy of the sequence
$k \log_B 2 \bmod 1$ (which follows from $\log_B 2$ being of finite type; see below).

The reader should be aware that the standard notations from number theory
and probability theory sometimes conflict (for example, $\sigma$ is used to denote the
real part of a point in the complex plane as well as the standard deviation of a
distribution); we try to follow common custom as much as possible. We denote the
Fourier transform (or characteristic function) of $f$ by $\hat{f}(y) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i x y}\,dx$.
Recall $g(T) = o(1)$ means $g(T) \to 0$ as $T \to \infty$, and $g(T) \ll h(T)$ or $g(T) =
O(h(T))$ means there is some constant $C$ such that for all $T$ sufficiently large,
$|g(T)| \le C h(T)$. Our proof of the Benford behavior of the $3x + 1$ problem uses the
(irrationality) type of $\log_B 2$ to control the errors; a number $\alpha$ is of type $\kappa$ if $\kappa$ is
the supremum of all $\gamma$ with

\[ \liminf_{q \to \infty} q^{\gamma + 1} \min_{p} \left| \alpha - \frac{p}{q} \right| = 0. \tag{1.1} \]

By Roth's theorem, every algebraic irrational is of type 1. See for example [HS, Ro]
for more details.

2. Benford's Law 

To study leading digits, we use the mantissa function, a generalization of scientific
notation. Fix a base $B > 1$ and for a real number $x > 0$ define the mantissa function,
$M_B(x)$, from the unique representation of $x$ by

\[ x = M_B(x) \cdot B^k, \quad \text{with } k \in \mathbb{Z} \text{ and } M_B(x) \in [1, B). \tag{2.1} \]

We extend the domain of mantissa to all of $\mathbb{C}$ via

\[ M_B(x) = \begin{cases} 0 & \text{if } x = 0 \\ M_B(|x|) & \text{if } x \neq 0. \end{cases} \tag{2.2} \]
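
The definitions (2.1) and (2.2) translate directly into code. Below is a minimal Python sketch of the mantissa function; the name `mantissa` and the floating-point guard at the interval endpoints are our own additions for illustration, and the sketch works with machine floats rather than exact arithmetic.

```python
import math

def mantissa(x: complex, base: float = 10.0) -> float:
    """Return M_B(x) in [1, B) with |x| = M_B(x) * B**k for an integer k; M_B(0) = 0."""
    r = abs(x)                      # (2.2): the mantissa of x is that of |x|
    if r == 0:
        return 0.0
    k = math.floor(math.log(r, base))
    m = r / base ** k
    # Guard against floating-point rounding pushing m just outside [1, B).
    if m >= base:
        m /= base
    elif m < 1:
        m *= base
    return m

print(mantissa(314.0))      # mantissa of 3.14 * 10**2
print(mantissa(-0.002))     # sign is ignored
print(mantissa(3 + 4j))     # |3 + 4i| = 5
```

The leading digit of $x$ in base $B$ is then the integer part of $M_B(x)$.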

We study the mantissa of many different types of processes (discrete, continuous
and mixed), and it is convenient to be able to use the same language for all. Take
an ordered total space $\Omega$, for example $\mathbb{N}$ or $\mathbb{R}^+$, and a (weak notion of) measure $\mu$
on $\Omega$ such as the counting measure or Lebesgue measure. For a subset $A \subset \Omega$ and
an element $T \in \Omega$, denote by $A_T = \{\omega \in A : \omega \le T\}$ the truncated set. We define
the probability of $A$ via density in $\Omega$:

Definition 2.1. $\mathbb{P}(A) = \lim_{T \to \infty} \frac{\mu(A_T)}{\mu(\Omega_T)}$, provided the limit exists.

For $A \subset \mathbb{N}$ and $\mu$ the counting measure, $\mathbb{P}(A) = \lim_{T \to \infty} \frac{\#\{n \in A : n \le T\}}{T}$, while if
$A \subset \mathbb{R}^+$ and $\mu$ is Lebesgue measure then $\mathbb{P}(A) = \lim_{T \to \infty} \frac{\mu(\{0 \le t \le T : t \in A\})}{T}$. In an appendix
we extend our notion of probability to a slightly more general setting, but this
will do for now.

For a sequence of real numbers indexed by $\Omega$, $X = \{x_\omega\}_{\omega \in \Omega}$, and a fixed
$s \in [1, B)$, consider the pre-image of mantissa, $\{\omega \in \Omega : 1 \le M_B(x_\omega) \le s\}$; we
abbreviate this by $\{1 \le M_B(X) \le s\}$.

Definition 2.2. A sequence $X$ is said to be Benford (base $B$) if for all $s \in [1, B)$,

\[ \mathbb{P}\{1 \le M_B(X) \le s\} = \log_B s. \tag{2.3} \]

Definition 2.2 is applicable to the values of a function $f$, and we say $f$ is Benford
base $B$ if

\[ \lim_{T \to \infty} \frac{\mu\left(\{0 \le t \le T : 1 \le M_B(f(t)) \le s\}\right)}{T} = \log_B s. \tag{2.4} \]

We describe an equivalent condition for Benford behavior which is based on
equidistribution. Recall

Definition 2.3. A set $A \subset \mathbb{R}$ is equidistributed modulo 1 if for any $[a, b] \subset [0, 1]$
we have

\[ \lim_{T \to \infty} \frac{\mu\left(\{x \in A_T : x \bmod 1 \in [a, b]\}\right)}{\mu(A_T)} = b - a. \tag{2.5} \]

The following two statements are immediate:

Lemma 2.4. $u \equiv v \bmod 1$ if and only if the mantissas of $B^u$ and $B^v$ are the same,
base $B$.

Lemma 2.5. $y \bmod 1 \in [0, \log_B s]$ if and only if $B^y$ has mantissa in $[1, s]$.

The following result is a standard way to prove Benford behavior:

Theorem 2.6. Let $Y_B = \log_B |X|$, so pointwise $y_{\omega, B} = \log_B |x_\omega|$ (and set $\log_B 0 = 0$).
Then $Y_B$ is equidistributed modulo 1 if and only if $X$ is Benford base $B$.

Proof. By Lemma 2.5, the set $\{Y_B \bmod 1 \in [0, \log_B s]\}$ is the same as the set
$\{M_B(X) \in [1, s]\}$. Hence $Y_B$ is equidistributed modulo 1 if and only if

\[ \log_B s = \mathbb{P}\{Y_B \bmod 1 \in [0, \log_B s]\} = \mathbb{P}\{M_B(X) \in [1, s]\} \tag{2.6} \]

if and only if $X$ is Benford base $B$. □

Theorem 2.6 reduces investigations of Benford's Law to equidistribution modulo
1, which we analyze below.
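
To see Theorem 2.6 at work on a familiar example, the following Python sketch looks at the sequence $x_n = 2^n$ in base 10 (this example sequence is our choice for illustration). It compares the exact leading-digit frequencies with the Benford probabilities, and separately checks that the fractional parts of $\log_{10} 2^n = n \log_{10} 2$ fill an interval $[a, b]$ with frequency close to $b - a$. The cutoff `N` is an arbitrary choice.

```python
import math

N = 3000
LOG10_2 = math.log10(2)

# Exact leading digits of 2**n, read off from the decimal expansion.
digit_counts = [0] * 10
for n in range(1, N + 1):
    digit_counts[int(str(2 ** n)[0])] += 1
digit_freq = [c / N for c in digit_counts]

# Fractional parts of Y_n = log10(2**n) = n * log10(2).
frac_parts = [(n * LOG10_2) % 1.0 for n in range(1, N + 1)]

def fraction_in(a: float, b: float) -> float:
    """Proportion of the fractional parts lying in [a, b]."""
    return sum(a <= y <= b for y in frac_parts) / N

for d in range(1, 10):
    print(d, round(digit_freq[d], 4), round(math.log10(1 + 1 / d), 4))
print("share of fractional parts in [0.2, 0.7]:", fraction_in(0.2, 0.7))
```

By Lemma 2.5, the proportion of fractional parts in $[0, \log_{10} 2]$ should agree (up to rounding at the endpoints) with the frequency of leading digit 1.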
Remark 2.7. The limit in Definition 2.1, often called the natural density, will
exist for the sets in which we are interested, but need not exist in general. For
example, if $A$ is the set of positive integers with first digit 1, then $\frac{\#\{n \in A : n \le T\}}{T}$
oscillates between its lim inf of $\frac{1}{9}$ and its lim sup of $\frac{5}{9}$. One can study such sets by
using instead the analytic density

\[ \mathbb{P}_{\mathrm{an}}(A) = \lim_{s \to 1^+} \frac{\sum_{n \in A} n^{-s}}{\zeta(s)}, \tag{2.7} \]

where $\zeta(s)$ is the Riemann Zeta Function (see Section 4). A straightforward argument
using analytic density gives Benford-type probabilities. In particular, Bombieri
(see [Si], page 76) has noted that the analytic density of primes with first digit 1
is $\log_{10} 2$, and this can easily be generalized to Benford behavior for any first digit.



3. Poisson Summation and Equidistribution modulo 1

We investigate systems $X_T$ converging to a system $X$ with associated logarithmic
processes $Y_{T,B}$. For example, take some function $g : \mathbb{R} \to \mathbb{C}$ and let
$X = \{g(t)\}_{t \in \mathbb{R}}$. Then $X_T = \{g(t)\}_{0 \le t \le T}$ are truncations of $X$, with log-process
$Y_{T,B} = \{\log_B |g(t)|\}_{0 \le t \le T}$. When there is no ambiguity we drop the dependence
on $B$ and write just $Y_T$ for $Y_{T,B}$.

Let $f(x)$ be a fixed probability density with cumulative distribution function
$F(x) = \int_{-\infty}^{x} f(t)\,dt$. In our applications the probability densities of $Y_{T,B}$ are
approximately a spread version of $f$ such as $f_T(x) = \frac{1}{T} f\left(\frac{x}{T}\right)$. There is, however,
an error term, and the log-process $Y_{T,B}$ has a cumulative distribution function given
by

\[ F_T(x) = \int_{-\infty}^{x} \frac{1}{T} f\left(\frac{t}{T}\right) dt + E_T(x) = F\left(\frac{x}{T}\right) + E_T(x), \tag{3.1} \]

where $E_T$ is an error term. Our goal is to show that, under certain conditions, the
error term is negligible and $f_T(x)$ spreads to make $Y_{T,B}$ equidistributed modulo 1
as $T \to \infty$. This will imply that $X$ is Benford base $B$.

In our investigations we need the density $f$, cumulative distribution function $F_T$
and errors $E_T$ to satisfy certain conditions in order to control the error terms.

Definition 3.1 (Benford-good). Systems $Y_{T,B}$ with cumulative distribution functions
$F_T$ are Benford-good if the $F_T$ satisfy (3.1), the probability density $f$ satisfies
sufficient conditions for Poisson Summation ($\sum_n f(n) = \sum_n \hat{f}(n)$), and there is
a monotone increasing function $h(T)$ with $\lim_{T \to \infty} h(T) = \infty$ such that $f$ and $F_T$
satisfy

Condition 1. Small tails:

\[ F_T(\infty) - F_T(T h(T)) = o(1), \qquad F_T(-T h(T)) - F_T(-\infty) = o(1). \tag{3.2} \]

Condition 2. Rapid decay of the characteristic function:

\[ S(T) = \sum_{k \neq 0} \left| \frac{\hat{f}(Tk)}{k} \right| = o(1). \tag{3.3} \]

Condition 3. Small truncated translated error:

\[ \mathcal{E}_T(a, b) = \sum_{|k| \le T h(T)} \left[ E_T(b + k) - E_T(a + k) \right] = o(1) \tag{3.4} \]

for all $0 \le a < b \le 1$.



In all our applications $f$ will be a Gaussian, in which case the Poisson Summation
Formula holds. See for example [Da] (pages 14 and 63).

Condition 1 asserts that essentially all of the mass lies in $[-T h(T), T h(T)]$. In
applications $T$ will be the standard deviation, and this will follow from Central
Limit type convergence.

Condition 2 is quite weak, and is satisfied in most cases of interest. For example,
if $f$ is differentiable and $f'$ is integrable (as is the case if $f$ is the Gaussian density),
then $|\hat{f}(y)| \le \frac{1}{2\pi |y|} \int |f'(x)|\,dx = O\left(\frac{1}{|y|}\right)$, which suffices to show $S(T) = o(1)$.

Condition 3 is the most difficult to prove for a system, and to our knowledge has
not previously been analyzed in full detail. It is well known (see j^]) that there 
are some processes (for example, Bernoulli trials) with standard deviation of size
$T$ where the best attainable estimate is $E_T(x) = O\left(\frac{1}{T}\right)$. Errors this large lead to
$\mathcal{E}_T(a, b) = O(1)$.

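We now see why these conditions suffice. For $[a, b] \subset [0, 1)$, let $P_T(a, b)$ denote
the probability that $Y_{T,B} \bmod 1 \in [a, b]$. To prove $Y_{T,B}$ becomes equidistributed
modulo 1, we must show that $P_T(a, b) \to b - a$. We would like to argue as follows:

\begin{align*}
P_T(a, b) &= \mathbb{P}\left(Y_{T,B} \bmod 1 \in [a, b]\right) \\
&= \sum_{k \in \mathbb{Z}} \mathbb{P}\left(Y_{T,B} \in [a + k, b + k]\right) \\
&= \sum_{k \in \mathbb{Z}} \left( F_T(b + k) - F_T(a + k) \right) \\
&= \sum_{k \in \mathbb{Z}} \left[ \int_a^b \frac{1}{T} f\left(\frac{x + k}{T}\right) dx + E_T(b + k) - E_T(a + k) \right] \\
&= \int_a^b \sum_{k \in \mathbb{Z}} \frac{1}{T} f\left(\frac{x + k}{T}\right) dx + \sum_{k \in \mathbb{Z}} \left[ E_T(b + k) - E_T(a + k) \right]. \tag{3.5}
\end{align*}

While the main term can be handled by a straightforward application of Poisson
Summation, the best pointwise bounds for the error term are not summable over all
$k \in \mathbb{Z}$. This is why Condition 1 is necessary, so that we may restrict the summation.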



Theorem 3.2. Assume log-processes $Y_{T,B}$ are Benford-good. Then $Y_{T,B} \to Y_B$,
where $Y_B$ is equidistributed modulo 1.



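Proof. As the Fourier transform converts translation to multiplication, if $g_x(u) =
f\left(\frac{u + x}{T}\right)$ then a straightforward calculation shows that $\hat{g}_x(w) = e^{2\pi i x w}\, T \hat{f}(Tw)$ for
any fixed $x$. Our assumptions on $f$ allow us to apply Poisson Summation to $g_x$, and
we find

\[ \sum_{k \in \mathbb{Z}} f\left(\frac{x + k}{T}\right) = \sum_{k \in \mathbb{Z}} g_x(k) = \sum_{k \in \mathbb{Z}} \hat{g}_x(k) = T \sum_{k \in \mathbb{Z}} e^{2\pi i x k} \hat{f}(Tk). \tag{3.6} \]

Let $[a, b] \subset [0, 1]$. By Condition 1 and (3.1),

\begin{align*}
P_T(a, b) &= \sum_{|k| \le T h(T)} \left( F_T(b + k) - F_T(a + k) \right) \\
&\qquad + O\left( F_T(\infty) - F_T(T h(T)) \right) + O\left( F_T(-T h(T)) - F_T(-\infty) \right) \\
&= \sum_{|k| \le T h(T)} \left[ \int_a^b \frac{1}{T} f\left(\frac{x + k}{T}\right) dx + E_T(b + k) - E_T(a + k) \right] + o(1) \\
&= \sum_{|k| \le T h(T)} \frac{1}{T} \int_a^b f\left(\frac{x + k}{T}\right) dx + \mathcal{E}_T(a, b) + o(1). \tag{3.7}
\end{align*}

By Condition 3, $\mathcal{E}_T(a, b) = o(1)$; as $f$ is integrable we may return the sum to all
$k \in \mathbb{Z}$ at a cost of $o(1)$. The interchange of summation and integration below is
justified from the decay properties of $\hat{f}$. To see this, simply insert absolute values
in the arguments. Therefore using (3.6),

\begin{align*}
P_T(a, b) &= \int_a^b \frac{1}{T} \sum_{k \in \mathbb{Z}} f\left(\frac{x + k}{T}\right) dx + o(1) \\
&= \int_a^b \frac{1}{T} \left( \sum_{k \in \mathbb{Z}} g_x(k) \right) dx + o(1) \\
&= \int_a^b \frac{1}{T} \left( \sum_{k \in \mathbb{Z}} \hat{g}_x(k) \right) dx + o(1) \\
&= \int_a^b \left( \sum_{k \in \mathbb{Z}} e^{2\pi i x k} \hat{f}(Tk) \right) dx + o(1) \\
&= \hat{f}(0)(b - a) + \sum_{k \neq 0} \hat{f}(Tk)\, \frac{e^{2\pi i b k} - e^{2\pi i a k}}{2\pi i k} + o(1). \tag{3.8}
\end{align*}

As $f$ is a probability density, $\hat{f}(0) = 1$, and by Condition 2 the sum in (3.8) is $o(1)$.
Therefore

\[ P_T(a, b) = b - a + o(1), \tag{3.9} \]

which completes the proof. □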
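
The spreading mechanism in the proof can be observed numerically in the error-free case. The following Python sketch assumes $f$ is the standard Gaussian density and $E_T \equiv 0$ (our simplifying assumptions), and evaluates $P_T(a, b) = \sum_k \left[ F\left(\frac{b+k}{T}\right) - F\left(\frac{a+k}{T}\right) \right]$ directly; the helper names and the truncation point `K` are illustrative choices.

```python
import math

def normal_cdf(x: float) -> float:
    """Cumulative distribution function F of the standard Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_mod1(a: float, b: float, T: float) -> float:
    """P(Y mod 1 in [a, b]) for Y Gaussian with mean 0 and standard deviation T."""
    K = int(10 * T) + 10   # translates beyond ~10 standard deviations are negligible
    return sum(normal_cdf((b + k) / T) - normal_cdf((a + k) / T)
               for k in range(-K, K + 1))

for T in (0.2, 0.5, 1.0, 2.0):
    print(f"T = {T}: P_T(0, 0.3) = {prob_mod1(0.0, 0.3, T):.10f}")
```

For the standard Gaussian one has $\hat{f}(y) = e^{-2\pi^2 y^2}$, so the correction terms in (3.8) are already of size about $e^{-2\pi^2} \approx 3 \times 10^{-9}$ at $T = 1$; the output approaches $b - a = 0.3$ very quickly as $T$ grows, while for small $T$ the mass is still concentrated near 0.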

As an immediate consequence, we have: 



Theorem 3.3. Let $X_T$ (the truncation of $X$) have corresponding log-process $Y_{T,B}$.
Assume the $Y_{T,B}$ are Benford-good. Then $X$ is Benford base $B$.

Proof. This follows immediately from Theorems 3.2 and 2.6. □

An immediate application of Theorem 3.3 is to processes where the distribution
of the logarithms is exactly a spreading Gaussian (i.e., there are no errors to sum). 
We describe such a situation below. 
Recall a Brownian motion (or Wiener process) is a continuous process with
independent, normally distributed increments. So if $W_t$ is a Brownian motion, then
$W_t - W_s$ is a random variable having the Gaussian distribution with mean zero
and variance $t - s$, and is independent of the random variable $W_s - W_u$ provided
$u < s < t$.

A standard realization of Brownian motion is as the scaled limit of a random
walk. Let $x_1, x_2, x_3, \ldots$ be independent Bernoulli trials (taking the values $+1$ and
$-1$ with equal probability) and let $S_n = \sum_{i=1}^{n} x_i$ denote the partial sum. Then the
normalized process

\[ W_t^{(n)} = \frac{1}{\sqrt{n}} S_{nt} \tag{3.10} \]

(extended to a continuous process by linear interpolation) converges as $n \to \infty$ to
the Wiener process. See or Chapter 2.4 of |KaSh| for further details. 

A geometric Brownian motion is simply a process Y such that the process log Y 
is a Brownian motion. It was known to Benford that stock market indices empir- 
ically demonstrated this digit bias, and for almost as long these indices have been 
modelled by geometric Brownian motion. Thus Theorem 3.3 implies the well-known
observation that 

Corollary 3.4. A geometric Brownian motion is Benford. 
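
As a rough Monte Carlo illustration of the corollary, the Python sketch below samples the value of a driftless geometric Brownian motion at a fixed time across many independent paths, so that $\log_{10} Y$ is Gaussian, and tabulates leading digits. The standard deviation `sigma_log10`, the sample size, and the seed are assumptions chosen only to make the example reproducible; sampling across paths at one time is our simplification, not the time-density used in the definitions above.

```python
import math
import random

rng = random.Random(12345)          # fixed seed so the run is reproducible
sigma_log10 = 3.0                   # assumed standard deviation of log10(Y) at the sampling time
samples = 100_000

counts = [0] * 10
for _ in range(samples):
    y = rng.gauss(0.0, sigma_log10)     # log10 of the sampled value
    frac = y - math.floor(y)            # log10 of the mantissa, in [0, 1]
    digit = min(int(10 ** frac), 9)     # leading digit, clamped against rounding at frac = 1.0
    counts[digit] += 1

freqs = [c / samples for c in counts]
for d in range(1, 10):
    print(d, round(freqs[d], 4), round(math.log10(1 + 1 / d), 4))
```

With a standard deviation of several units on the logarithmic scale, the observed frequencies should sit within sampling error of the Benford probabilities.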

4. Values of L-Functions 
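Consider the Riemann Zeta function

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p \text{ prime}} \left( 1 - \frac{1}{p^s} \right)^{-1}. \tag{4.1} \]

Initially defined for $\mathrm{Re}(s) > 1$, $\zeta(s)$ has a meromorphic continuation to all of $\mathbb{C}$.
More generally, one can study an L-function

\[ L(s, f) = \sum_{n=1}^{\infty} \frac{a_f(n)}{n^s} = \prod_{p \text{ prime}} \prod_{j=1}^{d} \left( 1 - \alpha_{f,j}(p)\, p^{-s} \right)^{-1}, \tag{4.2} \]

where the coefficients $a_f(n)$ have arithmetic significance. Common examples include
Dirichlet L-functions (where $a_f(n) = \chi(n)$ for a Dirichlet character $\chi$) and
elliptic curve L-functions (where $a_f(p)$ is related to the number of points on the
elliptic curve modulo $p$).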
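
The equality of the Dirichlet series and the Euler product in (4.1) is easy to check numerically in the region of absolute convergence. The Python sketch below compares truncations of both sides at $s = 2$, where $\zeta(2) = \pi^2/6$; the function names and the truncation points are illustrative choices.

```python
import math

def primes_up_to(n: int) -> list:
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]

def zeta_dirichlet(s: float, terms: int) -> float:
    """Truncated Dirichlet series sum_{n <= terms} n^{-s}."""
    return sum(n ** -s for n in range(1, terms + 1))

def zeta_euler(s: float, prime_bound: int) -> float:
    """Truncated Euler product over primes p <= prime_bound."""
    product = 1.0
    for p in primes_up_to(prime_bound):
        product *= 1.0 / (1.0 - p ** -s)
    return product

print(zeta_dirichlet(2.0, 100_000))
print(zeta_euler(2.0, 10_000))
print(math.pi ** 2 / 6)
```

Both truncations agree with $\pi^2/6$ to about four decimal places; neither expression converges on the critical line, which is why the analytic continuation is needed there.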

All the L-functions we study satisfy (after suitable renormalization) a functional
equation relating their value at $s$ to their value at $1 - s$. The region $0 < \mathrm{Re}(s) < 1$
is called the critical strip, and $\mathrm{Re}(s) = \frac{1}{2}$ the critical line. The behavior of
L-functions in the critical strip, especially on the critical line, is of great interest in
number theory. The Generalized (or, as some prefer, Grand) Riemann Hypothesis,
GRH, asserts that the non-trivial zeros of any "nice" L-function are on the critical line. The
location of the zeros of $\zeta(s)$ is intimately connected with the error estimates in
the Prime Number Theorem. The Riemann Zeta function can be expressed as the
moment of the maximum of a Brownian Excursion, and the distribution of the
zeros (respectively, values) of L-functions is believed to be connected to that of
eigenvalues (respectively, values of characteristic polynomials) of random matrix
ensembles. See [BPY, Con, KaSa, KeSn] for excellent surveys.
We investigate the leading digits of L-functions near the critical line, and show
that the distribution of the digits of their absolute values is Benford (see Theorem
4.4 for the precise statement). The starting point of our investigations of values of
the Riemann zeta function along the critical line $s = \frac{1}{2} + it$ is the log-normal law
(see IT^ISdT] ): 

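\[ \lim_{T \to \infty} \frac{\mu\left(\left\{0 \le t \le T : \log\left|\zeta\left(\tfrac{1}{2} + it\right)\right| \le y \sqrt{\tfrac{1}{2} \log\log T}\right\}\right)}{T} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y} e^{-u^2/2}\,du. \tag{4.3} \]

Thus the density of values of $\log\left|\zeta\left(\frac{1}{2} + it\right)\right|$ for $t \in [0, T]$ is well approximated by
a Gaussian with mean zero and standard deviation

\[ \psi_T = \sqrt{\tfrac{1}{2} \log\log T} + O(\log\log\log T). \tag{4.4} \]

Such results are often used to investigate small values of $\left|\zeta\left(\frac{1}{2} + it\right)\right|$ and gaps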
between zeros. As such, the known error terms are too crude for our purposes. In 
particular, one has (trivially modifying (4.21) of |Hej| or (8) of 0) that 

/i({te[r,2r]:a<log|C(|+it)|<6}) ^ 1 [\^u^/2^ldu+0 

(4.5) 

The main term is Gaussian with increasing variance, precisely what we require
for equidistribution modulo 1. The error term, however, is too large for pointwise
evaluation (as we have of the order $\psi_T \log \psi_T$ intervals $[a + n, b + n]$).

Better pointwise error estimates are obtained for many L-functions in [Hej].
These estimates are good enough for us to see Benford behavior as $T \to \infty$ near
the line $\mathrm{Re}(s) = \frac{1}{2}$. Explicitly, consider an L-function (or a linear combination of
L-functions, though for simplicity of exposition we confine ourselves to the case of
one L-function) satisfying

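Definition 4.1 (Good L-Function). We say an L-function is good if it satisfies
the following properties:

(1) Euler product:

\[ L(s, f) = \sum_{n=1}^{\infty} \frac{a_f(n)}{n^s} = \prod_{p \text{ prime}} \prod_{j=1}^{d} \left( 1 - \alpha_{f,j}(p)\, p^{-s} \right)^{-1}. \tag{4.6} \]

(2) $L(s, f)$ has a meromorphic continuation to $\mathbb{C}$, is of finite order, and has at
most finitely many poles (all on the line $\mathrm{Re}(s) = 1$).

(3) Functional equation:

\[ e^{i\omega} G(s) L(s, f) = e^{-i\omega}\, \overline{G(1 - \bar{s}) L(1 - \bar{s})}, \tag{4.7} \]

where $\omega \in \mathbb{R}$ and

\[ G(s) = Q^s \prod_{i=1}^{h} \Gamma(\lambda_i s + \mu_i) \tag{4.8} \]

with $Q, \lambda_i > 0$ and $\mathrm{Re}(\mu_i) \ge 0$.

(4) For some $H > 0$, $c \in \mathbb{C}$, $x \ge 2$ we have

\[ \sum_{p \le x} \frac{|a_f(p)|^2}{p} = H \log\log x + c + O\left(\frac{1}{\log x}\right). \tag{4.9} \]

(5) The $\alpha_{f,j}(p)$ are (Ramanujan-Petersson) tempered: $|\alpha_{f,j}(p)| \le 1$.

(6) If $N(\sigma, T)$ is the number of zeros $\rho$ of $L(s)$ with $\mathrm{Re}(\rho) \ge \sigma$ and $\mathrm{Im}(\rho) \in
[0, T]$, then for some $\beta > 0$ we have

\[ N(\sigma, T) = O\left( T^{1 - \beta\left(\sigma - \frac{1}{2}\right)} \log T \right). \tag{4.10} \]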

Remark 4.2. There are many families of L-functions which satisfy the above six
conditions. The last two are the most difficult conditions to verify, as in all cases
where these are known the first four conditions can be shown to be satisfied. The
last two conditions are established for many L-functions (for example, see [Sel1]
for $\zeta(s)$ and [Luo] for holomorphic Hecke cuspidal forms of full level and even
weight $k > 0$; see Chapter 10 of [IK] for more on the subject); the zero-density
estimate (6) is an immediate consequence of GRH.

We quote a version of the log-normal law with better error terms (see (4.20) from
[Hej] with a trivial change of variables in the Gaussian integral); for the convenience
of the reader we list where the various parameters in Hejhal's result are defined. The
error terms will be pointwise summable, and allow us to prove Benford behavior.

Theorem 4.3 (Hejhal). Let $L(s, f)$ be a good L-function as in Definition 4.1, and

• fix $\delta \in (0, 1)$ ([Hej], Lemmas 2 and 3, page 556), $g \in (0, 1]$ ([Hej], Lemma
3, page 556) and $\kappa \in (1, 3]$ ([Hej], page 560 and (4.18) on page 562);

• choose (J>\ + 1^ ( |Hej| , page 563) and \ < cr < \ + j^p^ ( |Hej| , page 
562); 

• the variance $\psi(\sigma, T)$ (see [Hej], Lemma 1, page 566) satisfies

\[ \psi(\sigma, T) = H \log\left[ \min\left( \log T, \frac{1}{\sigma - \frac{1}{2}} \right) \right] + O(1); \tag{4.11} \]



• choose N = \i^{a,TY\ and y = T^^^'^ ( jHeH , (4.18), page 565). 
Then we have 



f,i{t€[T,2T]:a<\og\L{a + itJ)\<b}) _ 1 



e 



- ° {drr- ['•irm +«".^)-'' . <"^> 

the implied constant depends only on (3 (Condition (6) of Definition \^~l\j , f, 5, g 
and K. 

For our purposes, a satisfactory choice is to take a = \ + -. — and k > 2. Then 

^ log 1 

■<p{<J,T) Hloglogr + 0(l) and 

y(l/3)(l-2cr) _ yiog* T 3(NloglojT+0(l))« _ gjjp 



logi-^T 



3(Hloglogr-|-0(l))« 



« ("3, 

log T 

We now show, in a certain sense, the values of $|L(s, f)|$ are Benford. While any
modest cancellation would yield the following result on the critical line, due to our
error terms for each interval $[T, 2T]$ we must stay slightly to the right of $\mathrm{Re}(s) = \frac{1}{2}$.

Theorem 4.4. Let $L(s, f)$ be a good L-function as in Definition 4.1; for example
we may take $\zeta(s)$. If the GRH and Ramanujan conjectures hold we may take any
cuspidal automorphic L-function; see also Remark 4.2. Fix a $\delta \in (0, 1)$. For each
$T$, let $\sigma_T = \frac{1}{2} + \frac{1}{\log^\delta T}$. Then

\[ \lim_{T \to \infty} \frac{\mu\left(\left\{t \in [T, 2T] : 1 \le M_B\left(|L(\sigma_T + it, f)|\right) \le s\right\}\right)}{T} = \log_B s. \tag{4.14} \]

Thus the values of the L-function satisfy Benford's Law in the limit (with the limit
taken as described above) for any base $B$.

Proof. We first prove the claim for base $e$, and then comment on the changes needed
for a general base $B$. Unfortunately the notation from number theory slightly
conflicts with the standard notation from probability theory of Section 3. By Theorem
2.6 it suffices to show that for all $[a, b] \subset [0, 1]$,

\[ \lim_{T \to \infty} \frac{\mu\left(\left\{t \in [T, 2T] : \log|L(\sigma_T + it, f)| \bmod 1 \in [a, b]\right\}\right)}{T} = b - a. \tag{4.15} \]

Let $\psi_T = \psi(\sigma_T, T)$ be the variance of the Gaussian in (4.12), which tends to
infinity with $T$. The standard deviation is thus $\sqrt{\psi_T}$, and corresponds to what we
called $T$ in Section 3. Let $\eta(x)$ be the standard normal density (mean zero, variance one; $\eta$ plays
the role of $f$ from Section 3; as it is standard to denote L-functions by $L(s, f)$, we use $\eta$
here), and set $\eta_{\sqrt{\psi_T}}(x) = \frac{1}{\sqrt{\psi_T}}\, \eta\left(\frac{x}{\sqrt{\psi_T}}\right)$. Note $\eta_{\sqrt{\psi_T}}(x)$ is the density of
a normal with mean zero and variance $\psi_T$. By (4.12) we have

\[ F_{\sqrt{\psi_T}}(x) = \int_{-\infty}^{x} \eta_{\sqrt{\psi_T}}(u)\,du + E_{\sqrt{\psi_T}}(x), \tag{4.16} \]

where Et{x) — 0{ip^^). We must show the logarithms of the absolute values of the 
L-function are Benford-good. As $\eta$ is a Gaussian it satisfies the conditions for the
Poisson Summation Formula, and the log-process $Y_T = \log|L(\sigma_T + it, f)|$ satisfies
(3.1). Thus to apply Theorem 3.3 it suffices to show $\eta$, $F_{\sqrt{\psi_T}}$ and $\mathcal{E}_{\sqrt{\psi_T}}$ satisfy Conditions
1 through 3 for some monotone increasing function $h(\psi_T)$ with $\lim_{T \to \infty} h(\psi_T) = \infty$.
We take $h(\psi_T) = \sqrt{\log \psi_T}$.

Condition 1 is immediately verified. To show $F_{\sqrt{\psi_T}}(\infty) - F_{\sqrt{\psi_T}}\left(\sqrt{\psi_T}\, h(\psi_T)\right) =
o(1)$ we use (4.12) to conclude the contribution from the error is $o(1)$, and then note
that the integral of the Gaussian with standard deviation $\sqrt{\psi_T}$ past $\sqrt{\psi_T \log \psi_T}$ is
small (as $\eta$ is the density of the standard normal, this integral is dominated by

\[ \int_{|x| \ge \sqrt{\log \psi_T}} \eta(x)\,dx, \tag{4.17} \]

which is $o(1)$). Identical arguments show $F_{\sqrt{\psi_T}}\left(-\sqrt{\psi_T}\, h(\psi_T)\right) - F_{\sqrt{\psi_T}}(-\infty) = o(1)$.
As we are integrating a sizable distance past the standard deviation, it is easy to see
that the contribution from the Gaussian is small. We do not need the full strength
of the bounds in (4.12): the bounds from (4.5) suffice to control the errors.

Gondition 12 follows from the trivial fact that r]' is integrable. We now show Gon- 
ditionO holds. Here the bounds from H4.5|) just fail. Using those bounds and sum- 
ming over |fc| < y/ipTh^ipT) would yield an error of size O (^\/ipTh{ipT) ■ V'4>t \ _ 



r0T 
O (log^ '^ Vt). We instead use ^J^, and find for [a, b] C [0, 1] that 



£T{a,b) = 2^ [Erib + k) - Eria + k)] 

|fc|<vWl(V'T) 

\b-a 



-0T V VV^ 



= o(l) (4.18) 

because k > 1, i5 < 1 and ipT ^ log log T. 

As all the conditions of Theorem 3.2 are satisfied, we can conclude that

\[ P_{\sqrt{\psi_T}}(a, b) = b - a + o(1). \tag{4.19} \]

We have shown that as $T$ tends to infinity in this manner, the distribution
corresponding to $\log|L(\sigma_T + it, f)|$ converges to being equidistributed modulo 1, which
by Theorem 3.3 implies the values of $|L(\sigma_T + it, f)|$ are Benford base $e$ (as always,
along the specified path converging to the critical line).

For a general base $B$, note $\log_B x = \frac{\log x}{\log B}$. The effect of changing base is that
$\log_B |L(\sigma_T + it, f)|$ converges to a Gaussian with mean zero and variance
$\frac{1}{\log^2 B}\, \psi(\sigma_T, T)$ (instead of mean zero and variance $\psi(\sigma_T, T)$). The argument now
proceeds as before. □

Corollary 4.5. Theorem 4.4 is valid if instead of intervals $[T, 2T]$ we consider
intervals $[0, T]$.

Proof. Let $\alpha(T) = (\log\log\log T)^{\log 2}$. We consider the intervals $I_0 = [0, T/\alpha(T)]$
and

\[ I_i = \left[ 2^{i-1}\, T/\alpha(T),\ 2^{i}\, T/\alpha(T) \right], \quad i \in \{1, 2, \ldots, \log\log\log\log T\}. \tag{4.20} \]

We may ignore $I_0$ as it has length $o(T)$. For each interval $I_i$, $i \ge 1$, we use (4.12)
and argue as before. We may keep the same values of $\beta$, $\delta$, $g$, $\kappa$, $\sigma_T$ as before. $T$ and
$y$ change, which implies $\psi_T = \psi(\sigma_T, T)$ changes; however, the leading term of $\psi_T$
is still $H \log\log T$, and $y^{(1/3)(1 - 2\sigma)}$ again leads to negligible contributions. As there
are only $\log\log\log\log T$ intervals, we may safely add all the errors. □

Remark 4.6. If we stay a fixed distance off the critical line, we do not expect
Benford behavior. This is because for a fixed $\sigma > \frac{1}{2}$, for $\zeta(s)$ we have a distribution
function $G_\sigma$ such that

\[ \lim_{T \to \infty} \frac{\mu\left(\left\{t \in [T, 2T] : \log|\zeta(\sigma + it)| \in [a, b]\right\}\right)}{T} = G_\sigma(b) - G_\sigma(a). \tag{4.21} \]

Unlike the log-normal law (4.5), where the variance increases with $T$, note here
there is no increasing variance for fixed $\sigma$ (though of course the variance depends
on $\sigma$); see [BJ, JW] for proofs. Thus to see Benford behavior it is essential that as
$T$ increases our distance to the critical line decreases.

For investigations on the critical line, one can easily show Benford's Law holds
for a truncation of the series expansion of $\log\left|L\left(\frac{1}{2} + it, f\right)\right|$, where the truncation
depends on the height $T$. See (4.12) of [Hej] for the relevant version of the
log-normal law (which has a significantly better error term than (4.12)). Similarly,
one can prove statements along these lines for the real and imaginary parts of
L-functions.

Figure 1. Distribution of Digits of $|\zeta(s)|$ versus Benford Probabilities

Numerical investigations also support the conjectured Benford behavior. In Figure
1 we plot the percent of first digits of $\left|\zeta\left(\frac{1}{2} + it\right)\right|$ versus the Benford probabilities
for t = |, fee {0, 1, ... , 65535}, and note the Benford behavior quickly sets in. Of 
course, we believe that this is strong evidence for Benford behavior exactly on the 
critical line, but as they stand, our error terms are too big and our cancellation too 
small to demonstrate this statement. 

It is believed that values of characteristic polynomials of random matrix ensembles
model values of L-functions on the critical line. In Theorem A.2 of Appendix A
we show that the digit distribution of the values of these characteristic polynomials
converges to the Benford probabilities (as the size of the matrices tends to infinity),
providing additional support for the conjecture that L-functions are Benford on the
critical line.



5. The 3x + 1 Problem

People working on the Syracuse-Kakutani-Hasse-Ulam-Hailstorm-Collatz-$(3x + 1)$-Problem
(there have been a few) often refer to two striking anecdotes. One is Erdős'
comment that "Mathematics is not yet ready for such problems." The other is
Kakutani's communication to Lagarias: "For about a month everybody at Yale worked
on it, with no result. A similar phenomenon happened when I mentioned it at the
University of Chicago. A joke was made that this problem was part of a conspiracy
to slow down mathematical research in the U.S." Coxeter has offered $50 for its
solution, Erdos $500, and Thwaites, £1000. The problem has been connected to 
holomorphic solutions to functional equations, a Fatou set having no wandering do- 

(I) I ' 

ergodic theory on Z2, undecidable algorithms, and geometric Brownian motion, to 



name a few (see [Lag1, Lag2]). We now relate the $(3x + 1)$-Problem to Benford's
Law.

5.1. The Structure Theorem. If $x$ is a positive odd integer then $3x + 1$ is even,
so we can find an integer $k \ge 1$ such that $2^k \,\|\, (3x + 1)$, i.e. so that

\[ y = \frac{3x + 1}{2^k} \tag{5.1} \]

is also odd. In this way, we get the $(3x + 1)$-Map

\[ M : x \mapsto y. \tag{5.2} \]

We call the value of $k$ that arises in the definition of $y$ the $k$-value of $x$. Notice
that $y$ is odd and relatively prime to 3, so the natural domain for iterating $M$ is the
set $\Pi$ of positive integers prime to 2 and 3. Write $\Pi = 6\mathbb{N} + E$, where $E = \{1, 5\}$
is the set of possible congruence classes modulo 6. The total space is $\Omega = \Pi$, not $\mathbb{N}$
or $\mathbb{R}$, and the measure is the appropriate counting measure.

For every integer a; S 11 with < a; < 2^", computers have verified that enough 
iterations of the $(3x + 1)$-Map eventually send $x$ to the unique fixed point, 1. The
natural conjecture asks if the same statement holds for all $x \in \Pi$:

Conjecture 5.1 ($(3x + 1)$-Conjecture). For every $x \in \Pi$, there is an integer $n$ such
that $M^n(x) = 1$.

Suppose we apply $M$ a total of $m$ times, calling $x_0 = x$ and $x_i = M^i(x)$,
$i \in \{1, 2, \ldots, m\}$. For each $x_{i-1}$ there is a $k$-value, say $k_i$, such that

\[ x_i = M(x_{i-1}) = \frac{3 x_{i-1} + 1}{2^{k_i}}, \quad i \in \{1, 2, \ldots, m\}. \tag{5.3} \]

We store this information in an ordered $m$-tuple $(k_1, k_2, \ldots, k_m)$, called the $m$-path
of $x$. Let $\gamma_m$ denote the map sending $x$ to its $m$-path,

\[ \gamma_m : x \mapsto (k_1, k_2, \ldots, k_m). \tag{5.4} \]

The natural question is whether given an $m$-tuple of positive integers $(k_1, k_2, \ldots, k_m)$,
there is an integer $x$ whose $m$-path is precisely this $m$-tuple. If so, we would like
to classify the set of all such $x$. In other words, we want to study the inverse map
$\gamma_m^{-1}$.
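
The map $M$ and the $m$-path are straightforward to compute. The following Python sketch is one possible implementation; the function names `k_value_and_image` and `m_path` are our own, and the $k$-value is extracted with a standard bit trick for the largest power of 2 dividing an integer.

```python
def k_value_and_image(x: int):
    """Return (k, y) with 3x + 1 = 2**k * y and y odd, for a positive odd integer x."""
    if x <= 0 or x % 2 == 0:
        raise ValueError("x must be a positive odd integer")
    n = 3 * x + 1
    k = (n & -n).bit_length() - 1    # exponent of the largest power of 2 dividing n
    return k, n >> k

def m_path(x: int, m: int):
    """Return the m-path (k_1, ..., k_m) of x and the iterates [x_0, x_1, ..., x_m]."""
    ks, xs = [], [x]
    for _ in range(m):
        k, x = k_value_and_image(x)
        ks.append(k)
        xs.append(x)
    return tuple(ks), xs

path, iterates = m_path(7, 5)
print(path)       # k-values along the orbit of 7
print(iterates)   # 7 -> 11 -> 17 -> 13 -> 5 -> 1
```

For example, $3 \cdot 7 + 1 = 22 = 2 \cdot 11$, so the first $k$-value of 7 is 1 and $M(7) = 11$.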

The answer is given by the Structure Theorem, proved in [KonSi]: for each
$m$-tuple $(k_1, k_2, \ldots, k_m)$, not only does there exist an $x$ having this $m$-path, but this
path is enjoyed by two full arithmetic progressions, $x \in \{a_1 n + b_1,\ a_2 n + b_2\}_{n=0}^{\infty}$,
and we can solve explicitly for $a_i$ and $b_i$. In fact, $a_1 = a_2 = 6 \cdot 2^{k_1 + \cdots + k_m}$ and
$b_i < a_i$ (so the progressions are full; we do not miss any terms at the beginning).
Moreover, the two progressions fall into the two possible equivalence classes modulo
6; i.e., $\{b_1 \bmod 6,\ b_2 \bmod 6\} = \{1, 5\}$. The structure theorem is the key ingredient
in analyzing the limiting distributions. These will satisfy the conditions of our main
theorem (Theorem 3.3), and yield Benford's Law.
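
The two-progression statement can be checked by brute force on an initial segment of $\Pi$. The Python sketch below groups all $x \in \Pi$ below a bound by their $m$-path and tests, for every path with enough terms below the bound, that the elements in each residue class modulo 6 form an arithmetic progression with common difference $6 \cdot 2^{k_1 + \cdots + k_m}$ and first term smaller than that difference. The helper names and the bounds are illustrative assumptions; this is a finite sanity check, not a proof.

```python
from collections import defaultdict

def m_path(x: int, m: int) -> tuple:
    """m-path (k_1, ..., k_m) of a positive odd integer x under the (3x + 1)-Map."""
    ks = []
    for _ in range(m):
        n = 3 * x + 1
        k = (n & -n).bit_length() - 1
        ks.append(k)
        x = n >> k
    return tuple(ks)

def check_structure(m: int, bound: int) -> int:
    """Verify the two-progression claim for x in Pi, x < bound; return the number of paths tested."""
    groups = defaultdict(list)
    for x in range(1, bound, 2):
        if x % 3:                                  # x is prime to 2 and 3
            groups[m_path(x, m)].append(x)
    tested = 0
    for path, xs in groups.items():
        a = 6 * 2 ** sum(path)
        if 3 * a > bound:
            continue                               # too few terms below the bound to test
        for residue in (1, 5):
            prog = [x for x in xs if x % 6 == residue]
            assert prog[0] < a                     # b_i < a_i
            assert all(v - u == a for u, v in zip(prog, prog[1:]))
        tested += 1
    return tested

print(check_structure(2, 5000), "two-paths verified")
print(check_structure(3, 20000), "three-paths verified")
```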

Recall (Definition 2.1) that we define the probability of a subset $A \subset \Pi$ by

\[ \mathbb{P}(A) = \lim_{T \to \infty} \frac{\#\{x \in A : x \le T\}}{\#\{x \in \Pi : x \le T\}}, \tag{5.5} \]

provided the limit exists. We say a random variable $\xi$ has geometric distribution
with parameter $\frac{1}{2}$ (for brevity, geometrically distributed) if $\mathbb{P}(\xi = n) = \frac{1}{2^n}$ for
$n = 1, 2, \ldots$.



A consequence of the structure theorem is that

\[ \mathbb{P}\left( x : \gamma_m(x) = (k_1, \ldots, k_m) \right) = \frac{1}{2^{k_1 + \cdots + k_m}} = \prod_{i=1}^{m} \frac{1}{2^{k_i}}. \tag{5.6} \]
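
For $m = 1$, formula (5.6) says that the $k$-value of a random element of $\Pi$ equals $k$ with probability $2^{-k}$. The Python sketch below checks this exactly on the block of $\Pi$ below $6 \cdot 2^J$, which contains a whole number of periods of every progression with $k \le J$; the cutoff `J` and the variable names are our illustrative choices.

```python
from collections import Counter
from fractions import Fraction

def k_value(x: int) -> int:
    """Exponent of the largest power of 2 dividing 3x + 1."""
    n = 3 * x + 1
    return (n & -n).bit_length() - 1

J = 10
bound = 6 * 2 ** J                                      # a full period for every k <= J
pi_block = [x for x in range(1, bound, 2) if x % 3]     # elements of Pi below the bound
counts = Counter(k_value(x) for x in pi_block)
freq = {k: Fraction(counts[k], len(pi_block)) for k in range(1, J + 1)}

for k in range(1, J + 1):
    print(k, freq[k])
```

On this block the frequencies are exactly $1/2, 1/4, \ldots, 1/2^J$, in line with the geometric distribution with parameter $\frac{1}{2}$.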



Both the expectation and variance of a geometrically distributed random variable
are 2. For a seed $x_0$ let $x_m = M^m(x_0)$ be the $m$-th iterate. A natural quantity to
investigate is $\frac{x_m}{(3/4)^m x_0}$, where $\left(\frac{3}{4}\right)^m x_0$ is the expected value of $x_m$.

Theorem 5.2 ([KonSi]). The $k$-values are independent geometrically distributed
random variables. Further, for any $a \in \mathbb{R}$


log2 



< a 



V 



< a 



(5.7) 



where $S_m$ is the sum of $m$ geometrically distributed (with parameter $\frac{1}{2}$) i.i.d.r.v. By
the Central Limit Theorem, the right hand side converges to a Gaussian integral as
$m \to \infty$. The paths are also independent, and so the $(3x + 1)$-Paths are those of a
geometric Brownian motion with drift $\log_2 \frac{3}{4}$.



We remind the reader that a Brownian motion (and hence a geometric Brownian motion) can be realized as the limit of a random walk; the same phenomenon occurs here. The drift corresponds to the fact that the expected value is $\left(\frac34\right)^m x_0$, rather than just $x_0$.

It is worth remarking that a consequence of the drift being $\log_2\frac34$ (which is negative) is that it is natural to expect that typical trajectories return to the origin. This statement extends completely to $(d, g, h)$-Maps discussed in Appendix C. Theorem 5.2 is immediately applicable to investigations in base two (which is uninteresting as all first digits are 1). To study the $3x+1$ Problem in base $B$, one simply multiplies by $\frac{1}{\log_2 B}$, as $\log_B x = \frac{\log_2 x}{\log_2 B}$. This replaces $S_m - 2m$ with $\frac{S_m - 2m}{\log_2 B}$ or $(S_m - 2m)\log_B 2$.

5.2. A Tale of Two Limits. The $(3x+1)$-system, $X_T = \{x_i\}_{0 \le i \le T}$, is probably not Benford for any starting seed $x_0$ as we expect all of the terms to eventually be 1. If we stop the sequence after hitting 1 and consider the proportion of terms having a given leading digit $j$, this is a rational number, whereas $\log_{10} j$ is not. Of course, this rational number should be close to $\log_{10} j$, but it is difficult to quantify this proximity since it is easy to find arbitrarily large numbers decaying to 1 after even one iteration of the $(3x+1)$-map.

One sense in which Benford behavior can be proved is the same as the sense 
in which (3a; + l)-paths are those of a geometric Brownian motion. We use the 
structure theorem to prove 

Theorem 5.3. Let $B$ be any real number such that $\log_B 2$ is irrational of type $\kappa < \infty$; for example, one may take any integer $B$ which is not a perfect power of 2 (see (1.1) for a definition of type $\kappa$ and Theorem B.1 for a proof of the irrationality type of such integers). Then for any $[a, b] \subset [0, 1]$,

$$\lim_{m \to \infty} P\left(\log_B\left[\frac{x_m}{(3/4)^m x_0}\right] \bmod 1 \in [a, b]\right) = b - a. \tag{5.8}$$






As $\left(\frac34\right)^m x_0$ is the expected value of $x_m$, this implies the distribution of the ratio of the actual versus predicted value after $m$ iterates obeys Benford's Law (base $B$).

If $B = 2^n$ for some integer $n$, in the limit $\log_B\left[\frac{x_m}{(3/4)^m x_0}\right] \bmod 1$ takes the values $0, \frac1n, \ldots, \frac{n-1}{n}$ with equal probability, leading to a non-Benford digit bias depending only on $n$.



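Notice that since probability is defined through density, this is really two highly non-interchangeable limits:

$$\lim_{m\to\infty} P\left(\log_B\left[\frac{x_m}{(3/4)^m x_0}\right] \bmod 1 \in [a,b]\right) = \lim_{m\to\infty}\lim_{T\to\infty} \frac{\#\left\{x_0 \in \Pi_T : \log_B\left[\frac{x_m}{(3/4)^m x_0}\right] \bmod 1 \in [a,b]\right\}}{\#\Pi_T}. \tag{5.9}$$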



Though this is completely natural, it is worth remarking for the sake of precision. 
Of course, a good starting seed (one with a long life-span) should give a close 
approximation of Benford behavior, just as it will also be a generic Brownian sample 
path; this is supported by numerical investigations (see §5.4).

Let $\xi_1, \xi_2, \ldots$ be independent geometrically distributed random variables with $P(\xi_i = n) = \frac{1}{2^n}$, $n = 1, 2, \ldots$, and $E(\xi_i) = 2$, $\mathrm{Var}(\xi_i) = 2$. Let $S_m = \sum_{i=1}^m \xi_i$. Let $\zeta_i = \xi_i - 2$, $\overline{S}_m = \sum_{i=1}^m \zeta_i = S_m - 2m$. We know the distribution of $\log_B\left[\frac{x_m}{(3/4)^m x_0}\right]$ is the same as that of $(S_m - 2m)\log_B 2 = \overline{S}_m \log_B 2$. The proof is complicated by the fact that the sum of $m$ geometrically distributed random variables itself has a negative binomial distribution, supported on the integers. This gives a lattice distribution for which we cannot obtain sufficient bounds on the error, even by performing an Edgeworth expansion and estimating the rate of convergence in the Central Limit Theorem. The problem is that the error in missing a lattice point is of size $\frac{1}{\sqrt m}$, and we need to sum $\sqrt m\, h(m)$ terms (for some $h(m) \to \infty$). We are able to surmount this obstacle by an error analysis of the rate of convergence to equidistribution of $k \log_B 2 \bmod 1$.
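The limiting behavior of $(S_m - 2m)\log_B 2 \bmod 1$ can be explored numerically without simulation, since a sum of $m$ independent geometric variables with parameter $\frac12$ has the explicit law $P(S_m = s) = \binom{s-1}{m-1} 2^{-s}$ for $s \ge m$. The Python sketch below evaluates the probability that the fractional part lies in $[a, b]$. The function name, the truncation width and the rounding guard are our own illustrative choices, not part of the argument above.

```python
import math

def prob_frac_in(m, base, a, b, width=12.0):
    """P( (S_m - 2m) * log_base(2) mod 1 in [a, b] ) for S_m a sum of m geometric(1/2) variables.

    Uses P(S_m = s) = C(s-1, m-1) / 2^s and truncates the sum `width` standard
    deviations above the mean 2m (the variance is 2m); the neglected tail is tiny.
    """
    c = math.log(2) / math.log(base)
    s_max = 2 * m + int(width * math.sqrt(2 * m)) + 1
    total = 0.0
    for s in range(m, s_max + 1):
        # log of C(s-1, m-1) / 2^s, computed with log-gamma to avoid huge integers
        log_p = (math.lgamma(s) - math.lgamma(m) - math.lgamma(s - m + 1)
                 - s * math.log(2))
        frac = ((s - 2 * m) * c) % 1.0
        if frac > 1.0 - 1e-9:      # guard against rounding just below an integer
            frac = 0.0
        if a <= frac <= b:
            total += math.exp(log_p)
    return total
```

For base 10 and $m = 2000$ the value for $[0, \log_{10} 2]$ should be close to $\log_{10} 2 \approx 0.301$, whereas for base 4 the mass splits evenly between the fractional parts 0 and $\frac12$, which is the non-Benford behavior expected when $B$ is a power of 2.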

5.3. Proof of Theorem 5.3. To prove Theorem 5.3 we first collect some needed results. The proof is similar in spirit to Theorem 3.3, with the needed results playing a similar role as the three conditions; however, the discreteness of the $3x+1$ problem leads to some interesting technical complications, and it is easier to give a similar but independent proof than to adjust notation and show that the conditions of that theorem are satisfied.

In the statements below, $[a, b]$ is an arbitrary sub-interval of $[0,1]$. By the Central Limit Theorem, the distribution of $\overline{S}_m$ (although it only takes integer values) is approximately a Gaussian with standard deviation of size $\sqrt m$. Let $c \in \left(0, \frac12\right)$ and set $M = m^c$. Let

$$I_\ell = \{\ell M,\ \ell M + 1,\ \ldots,\ (\ell+1)M - 1\} \tag{5.10}$$

and $C = \log_B 2$ be an irrational number of type $\kappa$ (see (1.1)). Soundararajan informed us that one does not need $\log_B 2$ to be of finite type for our applications. For integer $B$, if $|B^p - 2^q| > 0$ then it is at least 1, and one obtains $o(M)$ instead of $O(M^\delta)$ in (5.15); the advantage of using finite type is we obtain sharper estimates on the rate of convergence, as well as being able to handle non-integral bases $B$. Let $\eta(x)$ denote the density of the standard normal:

$$\eta(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}. \tag{5.11}$$

We collect some results needed for the proof of Theorem 5.3.

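• From the Central Limit Theorem (see [Fe], Chapter XV): For any $k \in \mathbb{Z}$,

$$\mathrm{Prob}(C \cdot \overline{S}_m = C \cdot k) = \mathrm{Prob}(\overline{S}_m = k) = \frac{\eta\left(k/\sqrt m\right)}{\sqrt m} + o\left(\frac{1}{\sqrt m}\right). \tag{5.12}$$

We may write $o\left(\frac{1}{\sqrt m}\right)$ as $O\left(\frac{1}{g(m)\sqrt m}\right)$ for some monotone increasing $g(m)$ which tends to infinity. We use this to approximate the probability of $\overline{S}_m = k$. For future use, choose any monotone $h(m)$ tending to infinity such that $h(m) = o(g(m))$, $h(m) = o(m^{1/2005})$ and $\frac{h(m)M}{\sqrt m} = o(m^{-1/2005})$. As $M = m^c$ with $c < \frac12$, if $c$ is sufficiently small then such an $h$ exists.

• Let $k_1, k_2 \in I_\ell$. Then

$$\left|\eta\left(\frac{k_1}{\sqrt m}\right) - \eta\left(\frac{k_2}{\sqrt m}\right)\right| \le \eta\left(\frac{\ell M}{\sqrt m}\right)\left|1 - \exp\left(-\frac{2\ell M^2 + M^2}{2m}\right)\right|. \tag{5.13}$$

In practice this implies that for the $\ell$ we must study, there is negligible variation in the Gaussian for $k \in I_\ell$.

• By Poisson Summation (see page 63 of [Da]),

$$\frac{1}{\sigma}\sum_{n=-\infty}^{\infty} e^{-\pi n^2/\sigma^2} = \sum_{n=-\infty}^{\infty} e^{-\pi n^2 \sigma^2}, \qquad \sigma > 0. \tag{5.14}$$

We often take $\sigma^2 = \frac{2\pi m}{M^2}$, and use this to calculate the main term (as $\sigma \to \infty$, both sides of (5.14) tend to 1).

• For any $\epsilon > 0$, letting $\delta = 1 + \epsilon - \frac{1}{\kappa} < 1$ we have

$$\#\{k \in I_\ell : kC \bmod 1 \in [a, b]\} = M(b - a) + O(M^\delta). \tag{5.15}$$

The quantification of the equidistribution of $kC \bmod 1$ is the key ingredient in proving Benford behavior base $B$ (with $C = \log_B 2$). The rate of equidistribution, given the finiteness of the irrationality type of $C$, follows from the Erdős-Turán Theorem. As this is the key argument in our analysis, we provide a sketch of the proof in Appendix B; see Theorem 3.3 on page 124 of [KN] for complete details (while the proof given only applies for $I_0$, a trivial translation yields the claim for any $I_\ell$).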
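The count in (5.15) is easy to examine numerically. The Python sketch below counts the $k$ in a window $I_\ell$ whose multiple $kC$ has fractional part in $[a, b]$, with $C = \log_B 2$ computed in floating point. The function name and the sample parameters are ours, chosen only for illustration.

```python
import math

def count_in_window(ell, M, base, a, b):
    """Count k in I_ell = {ell*M, ..., (ell+1)*M - 1} with k*log_base(2) mod 1 in [a, b]."""
    c = math.log(2) / math.log(base)
    return sum(1 for k in range(ell * M, (ell + 1) * M)
               if a <= (k * c) % 1.0 <= b)
```

For example, `count_in_window(3, 10000, 10, 0.2, 0.5)` should be close to the main term $M(b - a) = 3000$, with a deviation that is small compared to $M$.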

Proof of Theorem 5.3. We must show that as $m \to \infty$, for any $[a, b] \subset [0, 1]$,

$$P_m(a, b) = \mathrm{Prob}(C\overline{S}_m \bmod 1 \in [a, b]) \tag{5.16}$$

tends to $b - a$. We have

$$P_m(a, b) = \sum_{|\ell| \le \frac{\sqrt m\, h(m)}{M}} \mathrm{Prob}(\overline{S}_m = k \in I_\ell : kC \bmod 1 \in [a, b]) + \sum_{|\ell| > \frac{\sqrt m\, h(m)}{M}} \mathrm{Prob}(\overline{S}_m = k \in I_\ell : kC \bmod 1 \in [a, b]). \tag{5.17}$$

The second sum in (5.17) is bounded by

$$\mathrm{Prob}\left(\overline{S}_m = k : |k| \ge \sqrt m\, h(m)\right). \tag{5.18}$$

By the Central Limit Theorem, (5.18) is $o(1)$. Alternatively, using the techniques below (with $[a, b] = [0, 1]$), one can show $\mathrm{Prob}\left(|\overline{S}_m| \le \sqrt m\, h(m)\right) = 1 + o(1)$, which implies (5.18) is $o(1)$. As we are not summing (5.18), it is okay to have an error here of size $\frac{1}{\sqrt m}$ (and errors of approximately this size arise if we add or subtract a lattice point). Therefore

$$P_m(a, b) = \sum_{|\ell| \le \frac{\sqrt m\, h(m)}{M}} \mathrm{Prob}(\overline{S}_m = k \in I_\ell : kC \bmod 1 \in [a, b]) + o(1) = \sum_{|\ell| \le \frac{\sqrt m\, h(m)}{M}} P_{m,\ell}(a, b) + o(1). \tag{5.19}$$

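The proof is completed by showing the above is $b - a + o(1)$. Consider an interval $I_\ell$. By (5.15) the number of $k \in I_\ell$ such that $kC \bmod 1 \in [a, b]$ is $(b - a)M + O(M^\delta)$, $\delta < 1$. By (5.12) the probability of each such $k$ is $\frac{\eta(k/\sqrt m)}{\sqrt m} + O\left(\frac{1}{g(m)\sqrt m}\right)$. We now use (5.13) to bound the error from evaluating all the $\eta\left(\frac{k}{\sqrt m}\right)$ at $k = \ell M$ and find

$$P_{m,\ell}(a, b) = \frac{(b-a)M}{\sqrt m}\,\eta\left(\frac{\ell M}{\sqrt m}\right) + O\left(\frac{M}{\sqrt m}\,\eta\left(\frac{\ell M}{\sqrt m}\right)\left|1 - \exp\left(-\frac{2\ell M^2 + M^2}{2m}\right)\right|\right) + O\left(\frac{M}{g(m)\sqrt m}\right) + O\left(\frac{M^\delta}{\sqrt m}\,\eta\left(\frac{\ell M}{\sqrt m}\right)\right); \tag{5.20}$$

summing over all $|\ell| \le \frac{\sqrt m\, h(m)}{M}$ gives $P_m(a, b) + o(1)$. This gives four sums, which we must show are $b - a + o(1)$.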

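The sums over $|\ell| \le \frac{\sqrt m\, h(m)}{M}$ of the first and fourth pieces of (5.20) are handled by Poisson Summation. We have for the first piece that

$$\frac{(b-a)M}{\sqrt m}\sum_{|\ell| \le \frac{\sqrt m\, h(m)}{M}} \eta\left(\frac{\ell M}{\sqrt m}\right) = \frac{(b-a)M}{\sqrt m}\sum_{\ell=-\infty}^{\infty} \eta\left(\frac{\ell M}{\sqrt m}\right) - \frac{(b-a)M}{\sqrt m}\sum_{|\ell| > \frac{\sqrt m\, h(m)}{M}} \eta\left(\frac{\ell M}{\sqrt m}\right). \tag{5.21}$$

As $h(m) \to \infty$, the second sum in (5.21) is bounded by

$$\int_{|t| \ge \frac{\sqrt m\, h(m)}{M}} \frac{1}{\sqrt{2\pi m/M^2}}\, e^{-t^2/2(m/M^2)}\,dt = \frac{1}{\sqrt{2\pi}}\int_{|u| \ge h(m)} e^{-u^2/2}\,du = o(1). \tag{5.22}$$

Using (5.14) with $\sigma^2 = \frac{2\pi m}{M^2}$ gives

$$\frac{(b-a)M}{\sqrt m}\sum_{|\ell| \le \frac{\sqrt m\, h(m)}{M}} \eta\left(\frac{\ell M}{\sqrt m}\right) = (b-a)\sum_{\ell=-\infty}^{\infty} e^{-\pi \ell^2 \sigma^2} + o(1) = (b-a)\left(1 + O\left(e^{-\pi\sigma^2}\right)\right) + o(1) = b - a + o(1) \tag{5.23}$$

as the final sum over $\ell \ne 0$ is bounded by a geometric series and $M = m^c$ with $c < \frac12$. Thus the first piece from (5.20) gives $b - a + o(1)$.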
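The Poisson Summation identity (5.14) used for the main term can be verified numerically. The Python sketch below evaluates truncated versions of both sides; the function name and the truncation parameter `terms` are our own choices for illustration, and the truncation is harmless because the summands decay like a Gaussian.

```python
import math

def theta_sides(sigma, terms=400):
    """Return both sides of (1/sigma) * sum exp(-pi n^2 / sigma^2) = sum exp(-pi n^2 sigma^2)."""
    ns = range(-terms, terms + 1)
    lhs = sum(math.exp(-math.pi * n * n / sigma ** 2) for n in ns) / sigma
    rhs = sum(math.exp(-math.pi * n * n * sigma ** 2) for n in ns)
    return lhs, rhs
```

For large $\sigma$ the right side is 1 up to an exponentially small correction, which is the mechanism by which a sum of many slowly varying Gaussian values is replaced by its main term.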



As the Gaussian is a monotone function (for $x > 0$ or $x < 0$), a similar argument shows the sum over $|\ell| \le \frac{\sqrt m\, h(m)}{M}$ of the fourth piece of (5.20) contributes $O(M^{\delta - 1}) + o(1)$. It is here that we use that $C\overline{S}_m$ is a very special equidistributed sequence modulo 1, namely it is of the form $kC \bmod 1$. This allows us to control the discrepancy (how many $k \in I_\ell$ give $kC \bmod 1 \in [a, b]$).

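We must now sum over $|\ell| \le \frac{\sqrt m\, h(m)}{M}$ the second and third pieces of (5.20). For the second piece, we have

$$\sum_{|\ell| \le \frac{\sqrt m\, h(m)}{M}} \frac{M}{\sqrt m}\,\eta\left(\frac{\ell M}{\sqrt m}\right)\left|1 - \exp\left(-\frac{2\ell M^2 + M^2}{2m}\right)\right|. \tag{5.24}$$

As $|\ell| \le \frac{\sqrt m\, h(m)}{M}$ and $M = m^c$ with $c < \frac12$, we have

$$\frac{2\ell M^2 + M^2}{2m} \ll \frac{h(m)M}{\sqrt m}. \tag{5.25}$$

Recall we chose $h(m)$ and $c$ such that $\frac{h(m)M}{\sqrt m} = o(m^{-1/2005})$. Therefore

$$\left|1 - \exp\left(-\frac{2\ell M^2 + M^2}{2m}\right)\right| \ll m^{-1/2005}. \tag{5.26}$$

As we chose $h(m)$ such that $h(m) = o(m^{1/2005})$, the sum in (5.24) is

$$\ll \frac{\sqrt m\, h(m)}{M}\cdot\frac{M}{\sqrt m}\cdot\frac{1}{m^{1/2005}} = \frac{h(m)}{m^{1/2005}} = o(1), \tag{5.27}$$

proving the second piece in (5.20) is negligible.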

We are left with the sum over $|\ell| \le \frac{\sqrt m\, h(m)}{M}$ of the third piece in (5.20). Its contribution is

$$O\left(\frac{\sqrt m\, h(m)}{M}\cdot\frac{M}{\sqrt m\, g(m)}\right) = O\left(\frac{h(m)}{g(m)}\right) = o(1). \tag{5.28}$$

Collecting the evaluations of the sums of the four pieces in (5.20), we see that

$$P_m(a, b) = b - a + o(1), \tag{5.29}$$

which completes the proof of Theorem 5.3 if $B \ne 2^n$ (and thus proves Benford behavior base 10 because, by Theorem B.1, $\log_{10} 2$ has finite irrationality type).

Consider now the case when $B = 2^n$. As $\overline{S}_m$ takes on integer values, the possible values modulo 1 for $(S_m - 2m)\log_B 2$ are $\left\{0, \frac1n, \ldots, \frac{n-1}{n}\right\}$. An identical argument shows each of these values is equally likely; by determining which intervals $[\log_B d, \log_B(d+1))$ they lie in, one can determine the (non-Benford) digit bias in this case. See also §5.4. □

In Appendix C a generalization of the $3x+1$ map is discussed; for such systems, one can easily prove the analogue of Theorem 5.3.

5.4. Numerical Investigations. Theorem 5.3 implies that the first digit of $\frac{x_m}{(3/4)^m x_0}$ will not be Benford in a base $B = 2^n$. As $S_m$ takes on integer values, $(S_m - 2m)\log_B 2$ is equally likely to be any of $0, \frac1n, \ldots, \frac{n-1}{n}$ modulo 1. We considered 100,000 seeds congruent to 1 modulo 6, starting at 419,753,999,998,525. We can rapidly analyze the behavior of such large numbers by representing each number as an array and then performing the required operations (multiplication by 3, addition by 1 and division by 2) digit by digit. Taking $m = 10$, we analyzed the first digits for $B = 4$, 8 and 16. We have (theoretical predictions in parentheses)



First Digit   1               2               3     4               5     6     7
Base 4        50.2% (50.0%)   49.8% (50.0%)   0%    N/A             N/A   N/A   N/A
Base 8        33.1% (33.3%)   33.6% (33.3%)   0%    33.3% (33.3%)   0%    0%    0%



In base 16, we only observe digits 1, 2, 4 and 8; all should occur 25% of the time; 
we observe them with frequencies 25.0%, 25.0%, 25.3% and 24.8%. In base 10, we 
observe 



First Digit   1       2       3       4       5      6      7      8      9
Observed      29.8%   17.9%   12.1%   10.0%   8.5%   9.8%   2.4%   8.7%   0.9%
Benford       30.1%   17.6%   12.5%   9.7%    7.9%   6.7%   5.8%   5.1%   4.6%



The difficulty in performing these experiments is that our theory is that of two 
limits, $T \to \infty$ and then $m \to \infty$. We want to choose large seeds $x_0$ (at least large
enough so that after $m$ applications of the $3x+1$ map we haven't hit 1); however,
that requires us to examine (at least on a log scale) a large number of $x_0$. Taking
larger starting values (say of the order 10^"°) makes it impractical to study enough 
consecutive seeds. In these cases, to approximate the limit as $T \to \infty$ it is best to
choose 100,000 seeds from a variety of starting values and average. 

While we cannot yet prove that the iterates of a generic fixed seed are Benford, 
we expect this to be so. The table below records the percent of first digits equal to 
j base 10 for a 100,000 random digit number under the 3x + 1 map (as the 3x + 1 
map involves simple digit operations, we may represent numbers as arrays, and 
the computations are quite fast). We performed two experiments: in the first we 
removed the highest power of 2 in each iteration (799,992 iterates), while in the 
second we had $M(x) = 3x + 1$ for $x$ odd and $\frac{x}{2}$ for $x$ even (2,402,282 iterates). In
both, the observed probabilities are extremely close to the Benford predictions (for
each digit, the corresponding z-statistics range from about −2 to 2).

First Digit   Benford Probability   Removing 2   z-statistic   Not Removing 2   z-statistic
1             0.3010                0.3021        2.00         0.3012            0.63
2             0.1761                0.1752       -2.10         0.1763            0.98
3             0.1249                0.1242       -1.97         0.1248           -0.69
4             0.0969                0.0967       -0.50         0.0967           -1.14
5             0.0792                0.0792        0.03         0.0792           -0.06
6             0.0670                0.0671        0.56         0.0667           -1.32
7             0.0580                0.0582        0.68         0.0581            0.89
8             0.0512                0.0513        0.79         0.0510           -0.77
9             0.0458                0.0460        0.99         0.0459            1.02

We calculated the $\chi^2$ values for both experiments: it is 12.38 in the first ($M(x) = \frac{3x+1}{2^k}$) and 6.60 in the second ($M(x) = 3x + 1$ for $x$ odd and $\frac{x}{2}$ otherwise). As for 8 degrees of freedom, $\alpha = .05$ corresponds to a $\chi^2$ value of 15.51, and $\alpha = .01$ corresponds to 20.09, we do not reject the null hypothesis and our experiments support the claim that the iterates of both maps obey Benford's law.

6. Conclusion and Future Work 

The idea of using Poisson Summation to show certain systems are Benford is 
not new (see for example [Pin] or page 63 of [Fe]); the difficulty is in bounding the
error terms. Our purpose here is to codify a certain natural set of conditions where 
the Poisson Summation can be executed, and show that interesting systems do 
satisfy these conditions; a natural future project is to determine additional systems 
that can be so analyzed. One of the original goals of the project was to prove 
that the first digits of the terms $x_m$ in the $3x+1$ Problem are Benford. While the techniques of this paper are close to handling this, the structure theorem at our disposal makes $\frac{x_m}{(3/4)^m x_0}$ the natural quantity to investigate (although numerical investigations strongly support the claim that for any generic seed, the iterates of the $3x+1$ map are Benford); however, we have not fully exploited the structure theorem and the geometric Brownian motion, and hope to return to analyzing the first digit of $x_m$ at a later time. Similarly, additional analysis of the error terms
in the expansions and integrations of L-functions may lead to proving Benford 
behavior on the critical line, and not just near it, although our results on values 
of L-functions near the critical line as well as the digits of values of characteristic 
polynomials of random matrix ensembles support the conjectured Benford behavior.

Appendix A. Values of Characteristic Polynomials

Consider the random matrix ensemble of $N \times N$ unitary matrices $U$ (with eigenvalues $e^{i\theta_n}$) with respect to Haar measure; the probability density of $U$ is

$$p(\theta_1, \ldots, \theta_N) = \frac{1}{(2\pi)^N N!}\prod_{1 \le j < m \le N}\left|e^{i\theta_j} - e^{i\theta_m}\right|^2. \tag{A.1}$$

Let

$$Z(U, \theta) = \det\left(I - Ue^{-i\theta}\right) = \prod_{n=1}^{N}\left(1 - e^{i(\theta_n - \theta)}\right) \tag{A.2}$$

be the characteristic polynomial of $U$. The values of characteristic polynomials have been shown to be a good model for the values of L-functions. Of interest to us are the results in [KeSn], where an analogue of the log-normal law of L-functions (Theorem 4.3) is shown for random matrix ensembles: as $N \to \infty$ the average of the absolute value of the characteristic polynomials of unitary matrices is Gaussian. Specifically, let $\rho_N(x)$ be the probability density for $\log|Z(U, \theta)|$ averaged with respect to Haar measure (Equation (36) of [KeSn]), and set

$$\widetilde{\rho}_N(x) = \sqrt{Q_2(N)}\,\rho_N\left(\sqrt{Q_2(N)}\,x\right). \tag{A.3}$$

Here $Q_2(N)$ is the variance, and by Equation (11) of [KeSn] satisfies

$$Q_2(N) = \frac12\log N + \frac12(\gamma + 1) + O\left(\frac{1}{N^2}\right), \tag{A.4}$$

where $\gamma$ is Euler's constant. Equation (53) of [KeSn] (and the comment immediately after it) yield

Theorem A.1 (Keating-Snaith). With $\widetilde{\rho}_N$ as above,

$$\widetilde{\rho}_N(x)\,dx = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx + O\left((\log N)^{-3/2}\,dx\right). \tag{A.5}$$

In terms of $\rho_N$, from (A.3) we immediately deduce that

$$\rho_N(x)\,dx = \frac{1}{\sqrt{2\pi Q_2(N)}}\,e^{-x^2/2Q_2(N)}\,dx + O\left(Q_2(N)^{-2}\,dx\right); \tag{A.6}$$

note the pointwise errors are of size one over the square of the variance. It is easy to show the conditions of Theorem 3.2 are satisfied. These errors are significantly smaller than the number theory analogues, in part due to the additional averaging (the formulas here are for averages with respect to Haar measure, whereas in number theory we studied one specific L-function). We thus have

Theorem A.2. As $N \to \infty$, the distribution of digits of the absolute values of
the characteristic polynomials of N x N unitary matrices (with respect to Haar 
measure) converges to the Benford probabilities. 

Proof. As the main term is given by a Gaussian, the only difficulty is in verifying Conditions 1 and 2. In our current setting, $\sqrt{Q_2(N)}$ is playing the role of $T$. Let $h(N) = \log Q_2(N)$. As

$$\int_{-\sqrt{Q_2(N)}\,h(N)}^{\sqrt{Q_2(N)}\,h(N)} \frac{1}{\sqrt{2\pi Q_2(N)}}\,e^{-x^2/2Q_2(N)}\,dx = 1 + o(1), \tag{A.7}$$

Condition 1 is satisfied. For Condition 2, note $E_T(b + k) - E_T(a + k)$ becomes $O\left(Q_2(N)^{-2}\right)$, and thus

$$\sum_{|k| \le \sqrt{Q_2(N)}\,h(N)} \left[E_N(b + k) - E_N(a + k)\right] \ll \sqrt{Q_2(N)}\,h(N)\,Q_2(N)^{-2} \ll \frac{\log Q_2(N)}{Q_2(N)^{3/2}}. \tag{A.8}$$

□



Remark A.3. While we believe the distribution of digits of L-functions on the critical line is Benford, our results (Theorem 4.4 and Corollary 4.5) apply to values just off the critical line. Theorem A.2 may thus be interpreted as providing additional support to the conjectured Benford behavior of L-functions on the critical line.

Remark A.4. In our earlier investigations of Benford behavior, we used either the counting measure (first $N$ terms of a sequence) or Lebesgue measure (values of the function at arguments $t \in [0, N]$), with $N \to \infty$. We have an extra averaging here. We are not looking at the characteristic polynomials of a sequence of unitary matrices $U_N$ (where $U_N$ is $N \times N$). Instead for each $N$ we use Haar measure on $N \times N$ unitary matrices to average the values of the characteristic polynomials, and then send $N \to \infty$. The averaged characteristic polynomials play an analogous role to our L-functions from before.



Appendix B. Irrationality type of $\log_B 2$ and Equidistribution

Theorem B.1. Let $B$ be a positive integer not of the form $2^n$ for an integer $n$. Then $\log_B 2$ is of finite type.



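Proof. By the definition of irrationality type, we must show for some finite $\kappa > 0$ that

$$\left|\log_B 2 - \frac{p}{q}\right| \gg \frac{1}{q^\kappa}. \tag{B.1}$$

As

$$\left|\log_B 2 - \frac{p}{q}\right| = \left|\frac{\log 2}{\log B} - \frac{p}{q}\right| = \frac{|q\log 2 - p\log B|}{|q|\log B}, \tag{B.2}$$

it suffices to show $|q\log 2 - p\log B| \gg q^{-N}$ for some finite $N$. This follows immediately from Theorem 2 of [Ba], which implies that if $\alpha_j$ and $\beta_j$ are algebraic integers of heights at most $A_j\ (\ge 4)$ and $B\ (\ge 4)$, then if $\Lambda = \beta_1\log\alpha_1 + \cdots + \beta_n\log\alpha_n \ne 0$, $|\Lambda| > B^{-C\Omega\log\Omega'}$, where $d$ is the degree of the extension of $\mathbb{Q}$ generated by the $\alpha_j$ and $\beta_j$, $C = (16nd)^{200n}$, $\Omega = \log A_1 \cdots \log A_n$ and $\Omega' = \Omega/\log A_n$. We take $B$ to be the maximum of $\beta_1 = q$ and $\beta_2 = -p$. (As stated we need $\alpha_1, \alpha_2 \ge 4$; we replace $q\log 2 - p\log B$ with $\frac12\left(q\log 4 - p\log B^2\right)$.) In our case $d = 1$, $n = 2$, $\alpha_1 = 4$, $\alpha_2 = B^2$. As $B$ is not a power of 2, $q\log 4 - p\log B^2 \ne 0$ unless $p, q = 0$. In particular,

$$\left|\log_B 2 - \frac{p}{q}\right| \gg \frac{1}{q^{1 + C\Omega\log\Omega'}}. \tag{B.3}$$

For $B = 10$ we may take $\kappa = 2.3942 \times 10^{602}$ (though almost surely a lower number would suffice). □

We show the connection between the irrationality type of $\alpha$ and equidistribution of $n\alpha \bmod 1$; see Theorem 3.3 on page 124 of [KN] for complete details. Define the discrepancy of a sequence $x_n$ ($n \le N$) by

$$D_N = \frac{1}{N}\sup_{[a,b] \subset [0,1]}\left|N(b - a) - \#\{n \le N : x_n \bmod 1 \in [a, b]\}\right|. \tag{B.4}$$

The Erdős-Turán Theorem (see [KN], page 112) states that there exists a $C$ such that for all $m$,

$$D_N \le C\left(\frac{1}{m} + \sum_{h=1}^{m}\frac{1}{h}\left|\frac{1}{N}\sum_{n=1}^{N} e^{2\pi i h x_n}\right|\right). \tag{B.5}$$

If $x_n = n\alpha$, then the sum on $n$ above is bounded by $\min\left(N, \frac{1}{|\sin \pi h\alpha|}\right) \le \min\left(N, \frac{1}{2\|h\alpha\|}\right)$, where $\|x\|$ is the distance from $x$ to the nearest integer. If $\alpha$ is of finite type, this leads to $\frac{1}{N}\sum_{h=1}^{m}\frac{1}{h\|h\alpha\|}$; for $\alpha$ of type $\kappa$, the sum over $h$ is of size $m^{\kappa - 1 + \epsilon}$, and the claimed equidistribution rate follows from taking $m = \lfloor N^{1/\kappa}\rfloor$.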



Appendix C. $(d, g, h)$-Maps

The Benford behavior of $3x+1$ also occurs in $(d, g, h)$-Maps, defined as follows. Consider positive coprime integers $d$ and $g$ with $g > d \ge 2$, and a periodic function $h(x)$ satisfying:

(1) $h(x + d) = h(x)$,

(2) $x + h(x) \equiv 0 \bmod d$,

(3) $0 < |h(x)| < g$.

The map $M$ is defined by the formula

$$M(x) = \frac{gx + h(gx)}{d^k}, \tag{C.1}$$

where $k$ is uniquely chosen so that the result is not divisible by $d$. Property (2) of $h$ guarantees $k \ge 1$. The natural domain of this map is the set $\Pi$ of positive integers not divisible by $d$ and $g$. Let $E$ be the set of integers between 1 and $dg$ that are divisible by neither $d$ nor $g$, so we can write $\Pi = dg\mathbb{N} + E$. The size of $E$ can easily be calculated: $|E| = (d - 1)(g - 1)$. In the same way as before, we have $m$-paths, which are the values of $k$ that appear in iterations of $M$, and we again denote them by $\gamma_m(x)$.
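The definition of $M$ translates directly into code. The Python sketch below builds the map from $d$, $g$ and a function `h`; the factory name `make_dgh_map` and the three instances are our own illustrative choices (for $d = 2$ the function $h$ only needs to be specified on odd arguments, so constant functions suffice).

```python
def make_dgh_map(d, g, h):
    """Return M for the (d, g, h)-map: M(x) = (g x + h(g x)) / d^k.

    `h` is a function of an integer, assumed periodic mod d with
    y + h(y) divisible by d whenever y is not divisible by d.
    M returns the pair (M(x), k), where d^k exactly divides g x + h(g x).
    """
    def M(x):
        y = g * x + h(g * x)
        k = 0
        while y % d == 0:
            y //= d
            k += 1
        return y, k
    return M

# Illustrative instances with d = 2:
three_x_plus_one = make_dgh_map(2, 3, lambda y: 1)
three_x_minus_one = make_dgh_map(2, 3, lambda y: -1)
five_x_plus_one = make_dgh_map(2, 5, lambda y: 1)
```

For instance, `three_x_plus_one(7)` gives $22/2 = 11$ with $k = 1$, while `three_x_minus_one(7)` gives $20/4 = 5$ with $k = 2$.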

The $3x+1$ Problem corresponds to $g = 3$, $d = 2$, and $h(1) = 1$, the $3x-1$ Problem corresponds to $g = 3$, $d = 2$, and $h(1) = -1$, the $5x+1$ Problem corresponds to $g = 5$, $d = 2$, and $h(1) = 1$, and so on. Similar to Theorem 5.2, one can show

Theorem C.1 ([KonSi]). The $(d, g, h)$-Paths are those of a geometric Brownian motion with drift $\log g - \frac{d}{d-1}\log d$.

We expect paths to decay for negative drift and escape to infinity for positive drift. All results on Benford's Law for the $(3x+1)$-Problem, in particular Theorem 5.3, generalize trivially to all $(d, g, h)$-Maps, with the (irrationality) type of $\log_B d$ the generalization of the (irrationality) type of $\log_B 2$; note Theorem B.1 is easily modified to analyze $\log_B d$.